Abstract

In this paper, an alternative approximation to the innovation method is introduced for the parameter estimation of diffusion processes from partial and noisy observations. It is based on a convergent approximation to the first two conditional moments of the innovation process through approximate continuous-discrete filters of minimum variance. It is shown that, for finite samples, the resulting approximate estimators converge to the exact one as the error of the approximate filters decreases. For an increasing number of observations, the estimators are asymptotically normally distributed and their bias decreases as that error decreases. A simulation study is provided to illustrate the performance of the new estimators.

1 Introduction

Statistical inference for diffusion processes described by Stochastic Differential Equations (SDEs) is currently a subject of intensive research. A basic difficulty of this statistical problem is that, except for a few simple examples, the joint distribution of the discrete-time observations of the process has no known closed form. In addition, if only some components of the diffusion process are observed, and these are contaminated with noise, an extra complication arises. Typically, in this situation, the statistical problem under consideration is reformulated in the framework of continuous-discrete state space models, where the SDE to be estimated defines the continuous state equation and the given observations are described in terms of a discrete observation equation. For this class of models, a number of estimators based on analytical and simulated approximations have been developed in the last four decades. See, for instance, Nielsen et al. (2000a) and Jimenez et al. (2006) for a review.

In particular, the present paper deals with the class of innovation estimators for the parameter estimation of SDEs given a time series of partial and noisy observations. These are the estimators obtained by maximizing a normal log-likelihood function of the discrete-time innovations associated with the underlying continuous-discrete state space model. Approximations to this class of estimators have been derived by approximating the discrete-time innovations by means of inexact filters. For this purpose, approximate continuous-discrete filters such as the Local Linearization (Ozaki 1994, Shoji 1998, Jimenez & Ozaki 2006), the extended Kalman (Nielsen & Madsen 2001, Singer 2002), and the second order (Nielsen et al. 2000b, Singer 2002) filters have been used, as well as discrete-discrete filters after the discretization of the SDE by means of a numerical scheme (Ozaki & Iino 2001, Peng et al. 2002). The approximate innovation estimators obtained in this way have been useful for the identification, from actual data, of a variety of neurophysiological, financial and molecular models, among others (see, e.g., Calderon 2009, Chiarella et al. 2009, Jimenez et al. 2006, Riera et al. 2004, Valdes et al. 1999). However, a common feature of the approximate innovation estimators mentioned above is that, once the observations are given, the error between the approximate and the exact innovations is fixed and completely determined by the distance between observations. Clearly, this increases the bias of the approximate estimators for finite samples and obstructs its asymptotic correction when the number of observations increases.

In this paper, an alternative approximation to the innovation estimator for diffusion processes is introduced, which is oriented to reduce and control the estimation bias. It is based on a recursive approximation of the first two conditional moments of the innovation process through approximate filters that converge to the linear one of minimum variance. It is shown that, for finite samples, the resulting approximate estimators converge to the exact one as the error of the approximate filters decreases. For an increasing number of observations, they are asymptotically normally distributed and their bias decreases as that error decreases.

The paper is organized as follows. In Section 2, basic notations and definitions are presented. In Section 3, the new approximate estimators are defined and some of their properties studied. As a particular instance, the order-$\beta$ innovation estimator based on convergent Local Linearization filters is presented in Section 4, as well as algorithms for its practical implementation. In the last section, the performance of the new estimators is illustrated with various examples.

2 Notation and preliminaries

Let $(\Omega,\mathcal{F},P)$ be the underlying complete probability space and $\{\mathcal{F}_{t},\ t\geq t_{0}\}$ be an increasing right continuous family of complete sub $\sigma$-algebras of $\mathcal{F}$, and let $x$ be a $d$-dimensional diffusion process defined by the stochastic differential equation

$$dx(t)=f(t,x(t);\theta)dt+\sum_{i=1}^{m}g_{i}(t,x(t);\theta)dw^{i}(t) \tag{1}$$

for $t\geq t_{0}\in\mathbb{R}$, where $f$ and $g_{i}$ are differentiable functions, $w=(w^{1},\ldots,w^{m})$ is an $m$-dimensional $\mathcal{F}_{t}$-adapted standard Wiener process, $\theta\in\mathcal{D}_{\theta}$ is a vector of parameters, and $\mathcal{D}_{\theta}\subset\mathbb{R}^{p}$ is a compact set. Linear growth, uniform Lipschitz and smoothness conditions on the functions $f$ and $g_{i}$ that ensure the existence and uniqueness of a strong solution of (1) with bounded moments are assumed for all $\theta\in\mathcal{D}_{\theta}$.

Let us consider the state space model defined by the continuous state equation (1) and the discrete observation equation

$$z(t_{k})=Cx(t_{k})+e_{t_{k}},\quad\text{for }k=0,1,\ldots,M-1, \tag{2}$$

where $\{e_{t_{k}}:e_{t_{k}}\sim\mathcal{N}(0,\Pi_{t_{k}}),\ k=0,\ldots,M-1\}$ is a sequence of $r$-dimensional i.i.d. Gaussian random vectors independent of $w$, $\Pi_{t_{k}}$ is an $r\times r$ positive semi-definite matrix, and $C$ is an $r\times d$ matrix. Here, it is assumed that the $M$ time instants $t_{k}$ define an increasing sequence $\{t\}_{M}=\{t_{k}:t_{k}<t_{k+1},\ k=0,1,\ldots,M-1\}$.

Suppose that $M$ partial and noisy observations of the diffusion process $x$ defined by (1) with $\theta=\theta_{0}\in\mathcal{D}_{\theta}$ are given on $\{t\}_{M}$. In particular, denote by $Z=\{z_{t_{0}},\ldots,z_{t_{M-1}}\}$ the sequence of these observations, where $z_{t_{k}}$ denotes the observation at $t_{k}$ for all $t_{k}\in\{t\}_{M}$.

The inference problem to be considered here is the estimation of the parameter $\theta_{0}$ of the SDE (1) given the time series $Z$. Specifically, let us consider the innovation estimator defined as follows.

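Definition 1 (Ozaki, 1994) Given $M$ observations $Z$ of the continuous-discrete state space model (1)-(2) with $\theta=\theta_{0}\in\mathcal{D}_{\theta}$ on $\{t\}_{M}$,

$$\widehat{\theta}_{M}=\arg\{\min_{\theta}U_{M}(\theta,Z)\} \tag{3}$$

defines the innovation estimator of $\theta_{0}$, where

$$U_{M}(\theta,Z)=(M-1)\ln(2\pi)+\sum_{k=1}^{M-1}\ln(\det(\Sigma_{t_{k}}))+\nu_{t_{k}}^{\top}(\Sigma_{t_{k}})^{-1}\nu_{t_{k}},$$

$\nu_{t_{k}}$ is the discrete innovation of the model and $\Sigma_{t_{k}}$ the innovation variance for all $t_{k}\in\{t\}_{M}$.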
In the above definition,

$$\nu_{t_{k}}=z_{t_{k}}-Cx_{t_{k}/t_{k-1}}\quad\text{and}\quad\Sigma_{t_{k}}=CU_{t_{k}/t_{k-1}}C^{\top}+\Pi_{t_{k}},$$

where $x_{t_{k}/t_{k-1}}=E(x(t_{k})|Z_{t_{k-1}})$ and $U_{t_{k}/t_{k-1}}=E(x(t_{k})x^{\top}(t_{k})|Z_{t_{k-1}})-x_{t_{k}/t_{k-1}}x_{t_{k}/t_{k-1}}^{\top}$ denote the conditional mean and variance of the diffusion process $x$ at $t_{k}$ given the observations $Z_{t_{k-1}}=\{z_{t_{0}},\ldots,z_{t_{k-1}}\}$ for all $t_{k-1},t_{k}\in\{t\}_{M}$ and $\theta\in\mathcal{D}_{\theta}$. Here, the predictions $E(x(t_{k})|Z_{t_{k-1}})$ and $E(x(t_{k})x^{\top}(t_{k})|Z_{t_{k-1}})$ are recursively computed through the Linear Minimum Variance filter for the model (1)-(2). Because the first two conditional moments of $x$ are correctly specified, Theorem 1 in Ljung & Caines (1979) for prediction error estimators implies the consistency and the asymptotic normality of the innovation estimator (3) under conventional regularity conditions (Ozaki 1994, Nolsoe et al. 2000).
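As a minimal sketch of how the objective in Definition 1 is evaluated once the one-step predictions are available, the following Python function computes $U_M$ for the scalar case ($d=r=1$). The function name and the argument layout (lists indexed by $k=0,\ldots,M-1$, with the entries at $k=0$ unused) are assumptions made for illustration; the predictions themselves must come from a filter that is not shown here.

```python
import math

def innovation_objective(z, pred_mean, pred_var, C, Pi):
    """Scalar sketch of U_M(theta, Z) from Definition 1.

    z[k]         : observation at t_k
    pred_mean[k] : one-step prediction of the state mean at t_k given Z_{t_{k-1}}
    pred_var[k]  : one-step prediction of the state variance at t_k
    C, Pi        : observation coefficient and observation noise variance
    Only k = 1..M-1 enter the sum, as in the definition.
    """
    M = len(z)
    total = (M - 1) * math.log(2.0 * math.pi)
    for k in range(1, M):
        nu = z[k] - C * pred_mean[k]        # innovation nu_{t_k}
        S = C * pred_var[k] * C + Pi        # innovation variance Sigma_{t_k}
        total += math.log(S) + nu * nu / S
    return total
```

Minimizing this function over the parameters that generate `pred_mean` and `pred_var` would give the estimator (3); any general-purpose optimizer could be used for that step.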

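In general, since the conditional mean and variance of equation (1) have no explicit formulas, approximations to them are needed. If $\widetilde{x}_{t_{k}/t_{k-1}}$ and $\widetilde{U}_{t_{k}/t_{k-1}}$ are approximations to $x_{t_{k}/t_{k-1}}$ and $U_{t_{k}/t_{k-1}}$, then the estimator

$$\widehat{\vartheta}_{M}=\arg\{\min_{\theta}\widetilde{U}_{M}(\theta,Z)\},$$

with

$$\widetilde{U}_{M}(\theta,Z)=(M-1)\ln(2\pi)+\sum_{k=1}^{M-1}\ln(\det(\widetilde{\Sigma}_{t_{k}}))+\widetilde{\nu}_{t_{k}}^{\top}(\widetilde{\Sigma}_{t_{k}})^{-1}\widetilde{\nu}_{t_{k}},$$

provides an approximation to the innovation estimator (3), where

$$\widetilde{\nu}_{t_{k}}=z_{t_{k}}-C\widetilde{x}_{t_{k}/t_{k-1}}\quad\text{and}\quad\widetilde{\Sigma}_{t_{k}}=C\widetilde{U}_{t_{k}/t_{k-1}}C^{\top}+\Pi_{t_{k}}$$

are approximations to $\nu_{t_{k}}$ and $\Sigma_{t_{k}}$.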

Approximate estimators of this type have been considered in a number of papers. Approximate continuous-discrete filters such as Local Linearization filters (Ozaki 1994, Shoji 1998, Jimenez & Ozaki 2006), the extended Kalman filter (Nielsen & Madsen 2001, Singer 2002), and second order filters (Nielsen et al. 2000b, Singer 2002) have been used for approximating the values of $\nu_{t_{k}}$ and $\Sigma_{t_{k}}$. On the other hand, in Ozaki & Iino (2001) and Peng et al. (2002), discrete-discrete filters have also been used after the discretization of the equation (1) by means of a numerical scheme. In all these approximations, once the data $Z$ are given (and so the time partition $\{t\}_{M}$ is specified), the error between the approximate innovation $\widetilde{\nu}_{t_{k}}$ and the exact one $\nu_{t_{k}}$ is completely determined by $t_{k}-t_{k-1}$ and cannot be reduced. Consequently, the difference between the approximate innovation estimator $\widehat{\vartheta}_{M}$ and the exact one $\widehat{\theta}_{M}$ cannot be reduced either. Clearly, this is an important limitation of these approximate estimators. Nevertheless, in a number of practical situations (see Jimenez & Ozaki 2006, Jimenez et al. 2006, and references therein) the bias of the approximate innovation estimators is negligible. Therefore, these estimators have been useful for the identification, from actual data, of a variety of neurophysiological, financial and molecular models, among others, as mentioned above. Further, in a simulation study with the Cox-Ingersoll-Ross model of short-term interest rate, approximate innovation methods have provided similar or better results than those obtained by prediction-based estimating functions, but with much lower computational cost (Nolsoe et al., 2000). Similar results have been reported in a comparative study with the approximate likelihood via simulation method (Singer, 2002).

Denote by $\mathcal{C}_{P}^{l}(\mathbb{R}^{d},\mathbb{R})$ the space of $l$ times continuously differentiable functions $g:\mathbb{R}^{d}\rightarrow\mathbb{R}$ for which $g$ and all its partial derivatives up to order $l$ have polynomial growth.



3 Order-$\beta$ innovation estimator

Let $(\tau)_{h}=\{\tau_{n}:\tau_{n+1}-\tau_{n}\leq h,\ n=0,1,\ldots,N\}$ be a time discretization of $[t_{0},t_{M-1}]$ such that $(\tau)_{h}\supset\{t\}_{M}$, and let $y_{n}$ be the approximate value of $x(\tau_{n})$ obtained from a discretization of the equation (1) for all $\tau_{n}\in(\tau)_{h}$. Let us consider the continuous time approximation $y=\{y(t),\ t\in[t_{0},t_{M-1}]:y(\tau_{n})=y_{n}$ for all $\tau_{n}\in(\tau)_{h}\}$ of $x$ with initial conditions

$$E(y(t_{0})|\mathcal{F}_{t_{0}})=E(x(t_{0})|\mathcal{F}_{t_{0}})\quad\text{and}\quad E(y(t_{0})y^{\top}(t_{0})|\mathcal{F}_{t_{0}})=E(x(t_{0})x^{\top}(t_{0})|\mathcal{F}_{t_{0}}); \tag{4}$$

satisfying the bound condition

$$E(|y(t)|^{2q}\,|\,\mathcal{F}_{t_{0}})\leq L \tag{5}$$

for all $t\in[t_{0},t_{M-1}]$; and the weak convergence criteria

$$\sup_{t_{k}\leq t\leq t_{k+1}}\left|E(g(x(t))|\mathcal{F}_{t_{k}})-E(g(y(t))|\mathcal{F}_{t_{k}})\right|\leq L_{k}h^{\beta} \tag{6}$$

for all $t_{k},t_{k+1}\in\{t\}_{M}$ and $\theta\in\mathcal{D}_{\theta}$, where $g\in\mathcal{C}_{P}^{2(\beta+1)}(\mathbb{R}^{d},\mathbb{R})$, $L$ and $L_{k}$ are positive constants, $\beta\in\mathbb{N}_{+}$, and $q=1,2,\ldots$. The process $y$ defined in this way is typically called an order-$\beta$ approximation to $x$ in weak sense (Kloeden & Platen, 1999). The second conditional moment of $y$ is also assumed to be positive definite and continuous for all $\theta\in\mathcal{D}_{\theta}$.

In addition, let us consider the following approximation to the Linear Minimum Variance (LMV) filter of the model (1)-(2).

Definition 2 (Jimenez 2012b) Given a time discretization $(\tau)_{h}\supset\{t\}_{M}$, the order-$\beta$ Linear Minimum Variance filter for the state space model (1)-(2) is defined, between observations, by

$$y_{t/t}=E(y(t)/Z_{t})\quad\text{and}\quad V_{t/t}=E(y(t)y^{\top}(t)/Z_{t})-y_{t/t}y_{t/t}^{\top} \tag{7}$$

for all $t\in(t_{k},t_{k+1})$, and by

$$y_{t_{k+1}/t_{k+1}}=y_{t_{k+1}/t_{k}}+K_{t_{k+1}}(z_{t_{k+1}}-Cy_{t_{k+1}/t_{k}}), \tag{8}$$

$$V_{t_{k+1}/t_{k+1}}=V_{t_{k+1}/t_{k}}-K_{t_{k+1}}CV_{t_{k+1}/t_{k}}, \tag{9}$$

for each observation at $t_{k+1}$, with filter gain

$$K_{t_{k+1}}=V_{t_{k+1}/t_{k}}C^{\top}(CV_{t_{k+1}/t_{k}}C^{\top}+\Pi_{t_{k+1}})^{-1} \tag{10}$$

for all $t_{k},t_{k+1}\in\{t\}_{M}$, where $y$ is an order-$\beta$ approximation to the solution of (1) in weak sense, and $Z_{t}=\{z(t_{k}):t_{k}\leq t,\ t_{k}\in\{t\}_{M}\}$ are given observations of (1)-(2) until the time instant $t$. The predictions $y_{t/t_{k}}=E(y(t)/Z_{t_{k}})$ and $V_{t/t_{k}}=E(y(t)y^{\top}(t)/Z_{t_{k}})-y_{t/t_{k}}y_{t/t_{k}}^{\top}$, with initial conditions $y_{t_{k}/t_{k}}$ and $V_{t_{k}/t_{k}}$, are defined for all $t\in(t_{k},t_{k+1}]$ and $t_{k},t_{k+1}\in\{t\}_{M}$.
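The update at an observation time in Definition 2 can be sketched for the scalar case ($d=r=1$) as follows. This is only an illustration: the function name is hypothetical, and the variance update used here is the standard minimum-variance (Kalman-type) form $V-KCV$, which is assumed to be the one intended by the definition.

```python
def lmv_update(y_pred, V_pred, z, C, Pi):
    """Scalar sketch of the filter update at an observation time t_{k+1}.

    y_pred, V_pred : prediction mean and variance at t_{k+1} given Z_{t_k}
    z              : observation at t_{k+1}
    C, Pi          : observation coefficient and observation noise variance
    Returns the filtered mean, filtered variance and the gain.
    """
    S = C * V_pred * C + Pi               # innovation variance
    K = V_pred * C / S                    # filter gain
    y_filt = y_pred + K * (z - C * y_pred)
    V_filt = V_pred - K * C * V_pred      # assumed standard variance update
    return y_filt, V_filt, K
```

For example, `lmv_update(0.0, 1.0, 2.0, 1.0, 1.0)` weights the prediction and the observation equally, since the prediction variance and the noise variance coincide.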

Once an order-$\beta$ approximation to the solution of equation (1) is chosen, and so an order-$\beta$ LMV filter is specified, the following approximate innovation estimator can naturally be defined.

Definition 3 Given $M$ observations $Z$ of the continuous-discrete state space model (1)-(2) with $\theta=\theta_{0}\in\mathcal{D}_{\theta}$ on $\{t\}_{M}$, the order-$\beta$ innovation estimator for the parameters of (1) is defined by

$$\widehat{\theta}_{M}(h)=\arg\{\min_{\theta}U_{M,h}(\theta,Z)\}, \tag{11}$$

where

$$U_{M,h}(\theta,Z)=(M-1)\ln(2\pi)+\sum_{k=1}^{M-1}\ln(\det(\Sigma_{h,t_{k}}))+\nu_{h,t_{k}}^{\top}(\Sigma_{h,t_{k}})^{-1}\nu_{h,t_{k}},$$

with $\nu_{h,t_{k}}=z_{t_{k}}-Cy_{t_{k}/t_{k-1}}$ and $\Sigma_{h,t_{k}}=CV_{t_{k}/t_{k-1}}C^{\top}+\Pi_{t_{k}}$, where $y_{t_{k}/t_{k-1}}$ and $V_{t_{k}/t_{k-1}}$ are the prediction mean and variance of an order-$\beta$ LMV filter for the model (1)-(2), and $h$ is the maximum stepsize of the time discretization $(\tau)_{h}\supset\{t\}_{M}$ associated with the filter.

In principle, according to the above definitions, any kind of approximation $y$ converging to $x$ in a weak sense can be used to construct an approximate order-$\beta$ LMV filter and so an approximate order-$\beta$ innovation estimator. In this way, the Euler-Maruyama, the Local Linearization and any high order numerical scheme for SDEs such as those considered in Kloeden & Platen (1999) might be used as well. However, the approximations $\nu_{h,t_{k}}$ and $\Sigma_{h,t_{k}}$ to $\nu_{t_{k}}$ and $\Sigma_{t_{k}}$ in (11) at each $t_{k}$ will now be derived from the predictions of the approximate LMV filter after various iterations with stepsizes lower than $t_{k}-t_{k-1}$. Note that, when $(\tau)_{h}\equiv\{t\}_{M}$, an order-$\beta$ LMV filter might reduce to one of the conventional approximations to the exact LMV filter. In this situation, the corresponding order-$\beta$ innovation estimator reduces to one of the approximate innovation estimators mentioned in Section 2, in particular, to those considered in Ozaki (1994), Shoji (1998) and Jimenez & Ozaki (2006) when Local Linearization schemes are used to define order-$\beta$ LMV filters.
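To illustrate why iterating the prediction with stepsizes smaller than the sampling interval helps, the sketch below propagates the first two moments of the Euler-Maruyama approximation of a scalar linear SDE $dx=ax\,dt+s\,dw$ over one sampling interval and compares them with the exact moments. The test equation, the function names and the parameter values are assumptions chosen only for this illustration; they are not taken from the examples of the paper.

```python
import math

def euler_moment_prediction(m0, P0, a, s, delta, n_sub):
    """Moments of the Euler-Maruyama approximation after n_sub steps of size
    h = delta / n_sub, for dx = a*x*dt + s*dw.

    m0, P0 : mean and (non-central) second moment at the start of the interval
    Returns the predicted mean and second moment at the end of the interval.
    """
    h = delta / n_sub
    m, P = m0, P0
    for _ in range(n_sub):
        # y_{n+1} = (1 + a*h) * y_n + s * dW, with dW ~ N(0, h) independent of y_n
        P = P * (1.0 + a * h) ** 2 + s * s * h
        m = m * (1.0 + a * h)
    return m, P

def exact_moment_prediction(m0, P0, a, s, delta):
    """Exact mean and second moment of dx = a*x*dt + s*dw after time delta."""
    m = m0 * math.exp(a * delta)
    P = P0 * math.exp(2.0 * a * delta) + s * s * (math.exp(2.0 * a * delta) - 1.0) / (2.0 * a)
    return m, P

def mean_error(n_sub, a=-1.0, s=0.5, delta=1.0):
    """Absolute error of the predicted mean for a given number of sub-steps."""
    m_h, _ = euler_moment_prediction(1.0, 1.0, a, s, delta, n_sub)
    m_ex, _ = exact_moment_prediction(1.0, 1.0, a, s, delta)
    return abs(m_h - m_ex)
```

With `n_sub = 1` the discretization coincides with the observation times, which corresponds to the conventional setting; increasing `n_sub` reduces $h$ while the data stay fixed.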
3.1 Convergence

For a finite sample $Z$ of $M$ observations of the state space model (1)-(2), Theorem 5 in Jimenez (2012b) states the convergence of the order-$\beta$ LMV filters to the exact LMV one when $h$ decreases. Therefore, the convergence of the order-$\beta$ innovation estimator to the exact innovation estimator is to be expected when $h$ goes to zero.

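Theorem 4 Let $Z$ be a time series of $M$ observations of the state space model (1)-(2) with $\theta=\theta_{0}$ on the time partition $\{t\}_{M}$. Let $\widehat{\theta}_{M}$ and $\widehat{\theta}_{M}(h)$ be, respectively, the innovation and an order-$\beta$ innovation estimator for the parameters of (1) given $Z$. Then

$$\left|\widehat{\theta}_{M}(h)-\widehat{\theta}_{M}\right|\rightarrow0$$

as $h\rightarrow0$, for all fixed $Z$.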

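Proof. Defining $\Delta\Sigma_{h,t_{k}}=\Sigma_{t_{k}}-\Sigma_{h,t_{k}}$, it follows that

$$\det(\Sigma_{h,t_{k}})=\det(\Sigma_{t_{k}}-\Delta\Sigma_{h,t_{k}})=\det(\Sigma_{t_{k}})\det(I-\Sigma_{t_{k}}^{-1}\Delta\Sigma_{h,t_{k}}) \tag{12}$$

and

$$(\Sigma_{h,t_{k}})^{-1}=(\Sigma_{t_{k}}-\Delta\Sigma_{h,t_{k}})^{-1}=\Sigma_{t_{k}}^{-1}+\Sigma_{t_{k}}^{-1}\Delta\Sigma_{h,t_{k}}(I-\Sigma_{t_{k}}^{-1}\Delta\Sigma_{h,t_{k}})^{-1}\Sigma_{t_{k}}^{-1}. \tag{13}$$

By using these two identities and the identity

$$(z_{t_{k}}-\mu_{h,t_{k}})^{\top}(\Sigma_{h,t_{k}})^{-1}(z_{t_{k}}-\mu_{h,t_{k}})=(z_{t_{k}}-\mu_{t_{k}})^{\top}(\Sigma_{h,t_{k}})^{-1}(z_{t_{k}}-\mu_{t_{k}})+2(z_{t_{k}}-\mu_{t_{k}})^{\top}(\Sigma_{h,t_{k}})^{-1}(\mu_{t_{k}}-\mu_{h,t_{k}})+(\mu_{t_{k}}-\mu_{h,t_{k}})^{\top}(\Sigma_{h,t_{k}})^{-1}(\mu_{t_{k}}-\mu_{h,t_{k}}) \tag{14}$$

with $\mu_{t_{k}}=Cx_{t_{k}/t_{k-1}}$ and $\mu_{h,t_{k}}=Cy_{t_{k}/t_{k-1}}$, it is obtained that

$$U_{M,h}(\theta,Z)=U_{M}(\theta,Z)+R_{M,h}(\theta), \tag{15}$$

where $U_{M}$ and $U_{M,h}$ are defined in (3) and (11), respectively, and

$$R_{M,h}(\theta)=\sum_{k=1}^{M-1}\ln(\det(I-\Sigma_{t_{k}}^{-1}\Delta\Sigma_{h,t_{k}}))+(z_{t_{k}}-\mu_{t_{k}})^{\top}M_{h,t_{k}}(z_{t_{k}}-\mu_{t_{k}})+2(z_{t_{k}}-\mu_{t_{k}})^{\top}(\Sigma_{h,t_{k}})^{-1}(\mu_{t_{k}}-\mu_{h,t_{k}})+(\mu_{t_{k}}-\mu_{h,t_{k}})^{\top}(\Sigma_{h,t_{k}})^{-1}(\mu_{t_{k}}-\mu_{h,t_{k}})$$

with $M_{h,t_{k}}=\Sigma_{t_{k}}^{-1}\Delta\Sigma_{h,t_{k}}(I-\Sigma_{t_{k}}^{-1}\Delta\Sigma_{h,t_{k}})^{-1}\Sigma_{t_{k}}^{-1}$.

Theorem 5 in Jimenez (2012b) deals with the convergence of the order-$\beta$ filters to the exact LMV one. In particular, for the predictions, it states that

$$\left|x_{t_{k}/t_{k-1}}-y_{t_{k}/t_{k-1}}\right|\leq Kh^{\beta}\quad\text{and}\quad\left|U_{t_{k}/t_{k-1}}-V_{t_{k}/t_{k-1}}\right|\leq Kh^{\beta}$$

for all $t_{k},t_{k+1}\in\{t\}_{M}$, where $K$ is a positive constant. Here, we recall that $x_{t_{k}/t_{k-1}}$ and $U_{t_{k}/t_{k-1}}$ are the predictions of the exact LMV filter for the model (1)-(2), whereas $y_{t_{k}/t_{k-1}}$ and $V_{t_{k}/t_{k-1}}$ are the predictions of the order-$\beta$ filter. From this, and taking into account that $\mu_{t_{k}}-\mu_{h,t_{k}}=C(x_{t_{k}/t_{k-1}}-y_{t_{k}/t_{k-1}})$ and $\Delta\Sigma_{h,t_{k}}=C(U_{t_{k}/t_{k-1}}-V_{t_{k}/t_{k-1}})C^{\top}$, it follows that

$$\left|\mu_{t_{k}}-\mu_{h,t_{k}}\right|\rightarrow0\quad\text{and}\quad\left|\Sigma_{t_{k}}-\Sigma_{h,t_{k}}\right|\rightarrow0$$

as $h\rightarrow0$ for all $\theta\in\mathcal{D}_{\theta}$ and $k=1,\ldots,M-1$. This and the finite bound for the first two conditional moments of $x$ and $y$ imply that $R_{M,h}(\theta)\rightarrow0$ as well with $h$. From this and (15),

$$\left|\widehat{\theta}_{M}(h)-\widehat{\theta}_{M}\right|=\left|\arg\{\min_{\theta}\{U_{M}(\theta,Z)+R_{M,h}(\theta)\}\}-\arg\{\min_{\theta}U_{M}(\theta,Z)\}\right|\rightarrow0$$

as $h\rightarrow0$, which concludes the proof. ■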
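The decomposition of the approximate objective into the exact objective plus a remainder, used in the proof above, can be checked numerically in the scalar case. The sketch below is an illustration under that assumption (all matrices reduce to numbers, and the two cross terms of the quadratic form are merged into one); the function names and the sample values are hypothetical.

```python
import math

def objective_term(z, mu, S):
    """One summand ln(det(Sigma)) + nu' Sigma^{-1} nu of the objective (scalar case)."""
    return math.log(S) + (z - mu) ** 2 / S

def remainder_term(z, mu, S, mu_h, S_h):
    """One summand of the remainder R_{M,h} (scalar case).

    mu, S     : exact prediction of the observation mean and innovation variance
    mu_h, S_h : their order-beta approximations
    """
    dS = S - S_h                          # Delta Sigma_{h,t_k}
    a = dS / S                            # Sigma^{-1} Delta Sigma
    M_h = a / (1.0 - a) / S               # Sigma^{-1} dS (I - Sigma^{-1} dS)^{-1} Sigma^{-1}
    return (math.log(1.0 - a)
            + (z - mu) ** 2 * M_h
            + 2.0 * (z - mu) * (mu - mu_h) / S_h
            + (mu - mu_h) ** 2 / S_h)

def decomposition_gap(z, mu, S, mu_h, S_h):
    """Difference between the approximate term and (exact term + remainder)."""
    lhs = objective_term(z, mu_h, S_h)
    rhs = objective_term(z, mu, S) + remainder_term(z, mu, S, mu_h, S_h)
    return abs(lhs - rhs)
```

The gap is zero up to rounding for any admissible inputs, and the remainder itself vanishes as `mu_h` and `S_h` approach `mu` and `S`, which is the mechanism behind the convergence of the estimator.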

The above theorem states that, for each given data $Z$, the order-$\beta$ innovation estimator $\widehat{\theta}_{M}(h)$ converges to the exact one $\widehat{\theta}_{M}$ as $h$ goes to zero. Because $h$ controls the weak convergence criteria (6), it is then clear that the order-$\beta$ innovation estimator (11) converges to the exact one (3) when the error (in weak sense) of the order-$\beta$ approximation $y$ to $x$ decreases or, equivalently, when the error between the order-$\beta$ filter and the exact LMV filter decreases. This is a remarkable positive feature that the conventional approximations to the innovation estimator mentioned in Section 2 do not have.



3.2 Asymptotic properties 

In this section, asymptotic properties of the approximate innovation estimator $\widehat{\theta}_{M}(h)$ will be studied by using a general result obtained in Ljung & Caines (1979) for prediction error estimators. According to that, the relation between the estimator $\widehat{\theta}_{M}(h)$ and the global minimum $\theta_{M}^{*}$ of the function

$$W_{M}(\theta)=E(U_{M}(\theta,Z))\quad\text{with }\theta\in\mathcal{D}_{\theta} \tag{16}$$

should be considered, where $U_{M}$ is defined in (3). Here, it is worth remarking that $\theta_{M}^{*}$ is not an estimator of $\theta_{0}$ since the function $W_{M}$ does not depend on a given data $Z$. In fact, $\theta_{M}^{*}$ indexes the best predictor, in the sense that the average prediction error loss function $W_{M}$ is minimized at this parameter (Ljung & Caines, 1979).

In what follows, regularity conditions for the unique identifiability of the state space model (1)-(2) are assumed, which are typically satisfied by stationary and ergodic diffusion processes (see, e.g., Ljung & Caines, 1979).

Lemma 5 If $\Pi_{t_{k}}$ is positive definite for all $k=1,\ldots,M-1$, then the function $W_{M}(\theta)$ defined in (16) has a unique minimum and

$$\arg\{\min_{\theta\in\mathcal{D}_{\theta}}W_{M}(\theta)\}=\theta_{0}. \tag{17}$$

Proof. Since $\Pi_{t_{k}}$ is positive definite for all $k=1,\ldots,M-1$, Lemma A.2 in Bollerslev & Wooldridge (1992) ensures that $\theta_{0}$ is the unique minimum of the function

$$l_{k}(\theta)=E(\ln(\det(\Sigma_{t_{k}}))+\nu_{t_{k}}^{\top}(\Sigma_{t_{k}})^{-1}\nu_{t_{k}}\,|\,Z_{t_{k-1}})$$

on $\mathcal{D}_{\theta}$ for all $k$, where $\nu_{t_{k}}=z_{t_{k}}-Cx_{t_{k}/t_{k-1}}$ and $\Sigma_{t_{k}}=CU_{t_{k}/t_{k-1}}C^{\top}+\Pi_{t_{k}}$. Consequently, and under the assumed unique identifiability of the model (1)-(2), $\theta_{0}$ is then the unique minimum of

$$W_{M}(\theta)=(M-1)\ln(2\pi)+\sum_{k=1}^{M-1}E(l_{k}(\theta))$$

on $\mathcal{D}_{\theta}$. ■

Denote by $U_{M,h}^{\prime}$ the derivative of $U_{M,h}$ with respect to $\theta$, and by $W_{M}^{\prime\prime}$ the second derivative of $W_{M}$ with respect to $\theta$.

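Theorem 6 Let $Z$ be a time series of $M$ observations of the state space model (1)-(2) with $\theta=\theta_{0}$ on the time partition $\{t\}_{M}$. Let $\widehat{\theta}_{M}(h)$ be an order-$\beta$ innovation estimator for the parameters of (1) given $Z$. Then

$$\widehat{\theta}_{M}(h)-\theta_{0}\rightarrow\Delta\theta_{M}(h) \tag{18}$$

w.p.1 as $M\rightarrow\infty$, where $\Delta\theta_{M}(h)\rightarrow0$ as $h\rightarrow0$. Moreover, if for some $M_{0}\in\mathbb{N}$ there exists $\epsilon>0$ such that

$$W_{M}^{\prime\prime}(\theta)>\epsilon I\quad\text{and}\quad H_{M,h}(\theta)=ME(U_{M,h}^{\prime}(\theta,Z)(U_{M,h}^{\prime}(\theta,Z))^{\top})>\epsilon I \tag{19}$$

for all $M>M_{0}$ and $\theta\in\mathcal{D}_{\theta}$, then

$$\sqrt{M}P_{M,h}^{-1/2}(\widehat{\theta}_{M}(h)-\theta_{0})\sim\mathcal{N}(\Delta\theta_{M}(h),I) \tag{20}$$

as $M\rightarrow\infty$, where $P_{M,h}=(W_{M}^{\prime\prime}(\theta_{0}+\Delta\theta_{M}(h)))^{-1}H_{M,h}(\theta_{0}+\Delta\theta_{M}(h))(W_{M}^{\prime\prime}(\theta_{0}+\Delta\theta_{M}(h)))^{-1}+\Delta P_{M,h}$ with $\Delta P_{M,h}\rightarrow0$ as $h\rightarrow0$.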

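Proof. Let $W_{M,h}(\theta)=E(U_{M,h}(\theta,Z))$ and $\alpha_{M}(h)=\arg\{\min_{\theta\in\mathcal{D}_{\theta}}W_{M,h}(\theta)\}$, where $U_{M,h}$ is defined in (11).

For a fixed $h$, Theorem 1 in Ljung & Caines (1979) implies that

$$\widehat{\theta}_{M}(h)-\alpha_{M}(h)\rightarrow0 \tag{21}$$

w.p.1 as $M\rightarrow\infty$; and

$$\sqrt{M}P_{M,h}^{-1/2}(\alpha_{M}(h))(\widehat{\theta}_{M}(h)-\alpha_{M}(h))\sim\mathcal{N}(0,I) \tag{22}$$

as $M\rightarrow\infty$, where

$$P_{M,h}(\theta)=(W_{M,h}^{\prime\prime}(\theta))^{-1}H_{M,h}(\theta)(W_{M,h}^{\prime\prime}(\theta))^{-1}$$

with $H_{M,h}(\theta)=ME(U_{M,h}^{\prime}(\theta,Z)(U_{M,h}^{\prime}(\theta,Z))^{\top})$.

By using the identities (12)-(14), the function

$$W_{M,h}(\theta)=(M-1)\ln(2\pi)+\sum_{k=1}^{M-1}E(\ln(\det(\Sigma_{h,t_{k}}))+(z_{t_{k}}-\mu_{h,t_{k}})^{\top}(\Sigma_{h,t_{k}})^{-1}(z_{t_{k}}-\mu_{h,t_{k}})),$$

with $\mu_{h,t_{k}}=Cy_{t_{k}/t_{k-1}}$, can be written as

$$W_{M,h}(\theta)=W_{M}(\theta)+E(R_{M,h}(\theta)), \tag{23}$$

where $W_{M}$ is defined in (16) and

$$R_{M,h}(\theta)=\sum_{k=1}^{M-1}E(\ln(\det(I-\Sigma_{t_{k}}^{-1}\Delta\Sigma_{h,t_{k}}))\,|\,\mathcal{F}_{t_{k-1}})+E((z_{t_{k}}-\mu_{t_{k}})^{\top}M_{h,t_{k}}(z_{t_{k}}-\mu_{t_{k}})\,|\,\mathcal{F}_{t_{k-1}})+E(2(z_{t_{k}}-\mu_{t_{k}})^{\top}(\Sigma_{h,t_{k}})^{-1}(\mu_{t_{k}}-\mu_{h,t_{k}})+(\mu_{t_{k}}-\mu_{h,t_{k}})^{\top}(\Sigma_{h,t_{k}})^{-1}(\mu_{t_{k}}-\mu_{h,t_{k}})\,|\,\mathcal{F}_{t_{k-1}})$$

with $M_{h,t_{k}}=\Sigma_{t_{k}}^{-1}\Delta\Sigma_{h,t_{k}}(I-\Sigma_{t_{k}}^{-1}\Delta\Sigma_{h,t_{k}})^{-1}\Sigma_{t_{k}}^{-1}$, $\mu_{t_{k}}=Cx_{t_{k}/t_{k-1}}$ and $\Delta\Sigma_{h,t_{k}}=\Sigma_{t_{k}}-\Sigma_{h,t_{k}}$.

Denote by $W_{M,h}^{\prime\prime}$ and $R_{M,h}^{\prime\prime}$ the second derivatives of $W_{M,h}$ and $R_{M,h}$ with respect to $\theta$. Taking into account that

$$(W_{M,h}^{\prime\prime}(\theta))^{-1}=(W_{M}^{\prime\prime}(\theta)+E(R_{M,h}^{\prime\prime}(\theta)))^{-1}=(W_{M}^{\prime\prime}(\theta))^{-1}+K_{M,h}(\theta)$$

with

$$K_{M,h}(\theta)=-(W_{M}^{\prime\prime}(\theta))^{-1}E(R_{M,h}^{\prime\prime}(\theta))(I+(W_{M}^{\prime\prime}(\theta))^{-1}E(R_{M,h}^{\prime\prime}(\theta)))^{-1}(W_{M}^{\prime\prime}(\theta))^{-1},$$

it is obtained that

$$P_{M,h}(\theta)=(W_{M}^{\prime\prime}(\theta))^{-1}H_{M,h}(\theta)(W_{M}^{\prime\prime}(\theta))^{-1}+\Delta P_{M,h}(\theta), \tag{24}$$

where

$$\Delta P_{M,h}(\theta)=K_{M,h}(\theta)H_{M,h}(\theta)(W_{M}^{\prime\prime}(\theta))^{-1}+(W_{M}^{\prime\prime}(\theta))^{-1}H_{M,h}(\theta)K_{M,h}(\theta)+K_{M,h}(\theta)H_{M,h}(\theta)K_{M,h}(\theta).$$

Theorem 5 in Jimenez (2012b) deals with the convergence of the order-$\beta$ filters to the exact LMV one. In particular, for the predictions, it states that

$$\left|x_{t_{k}/t_{k-1}}-y_{t_{k}/t_{k-1}}\right|\leq Kh^{\beta}\quad\text{and}\quad\left|U_{t_{k}/t_{k-1}}-V_{t_{k}/t_{k-1}}\right|\leq Kh^{\beta}$$

for all $t_{k},t_{k+1}\in\{t\}_{M}$, where $K$ is a positive constant. Here, we recall that $x_{t_{k}/t_{k-1}}$ and $U_{t_{k}/t_{k-1}}$ are the predictions of the exact LMV filter for the model (1)-(2), whereas $y_{t_{k}/t_{k-1}}$ and $V_{t_{k}/t_{k-1}}$ are the predictions of the order-$\beta$ filter. From this, and taking into account that $\mu_{t_{k}}-\mu_{h,t_{k}}=C(x_{t_{k}/t_{k-1}}-y_{t_{k}/t_{k-1}})$ and $\Delta\Sigma_{h,t_{k}}=C(U_{t_{k}/t_{k-1}}-V_{t_{k}/t_{k-1}})C^{\top}$, it follows that

$$\left|\mu_{t_{k}}-\mu_{h,t_{k}}\right|\rightarrow0\quad\text{and}\quad\left|\Sigma_{t_{k}}-\Sigma_{h,t_{k}}\right|\rightarrow0$$

as $h\rightarrow0$ for all $\theta\in\mathcal{D}_{\theta}$ and $k=1,\ldots,M-1$. This and the finite bound for the first two conditional moments of $x$ and $y$ imply that $|R_{M,h}(\theta,Z)|\rightarrow0$ and $|R_{M,h}^{\prime\prime}(\theta,Z)|\rightarrow0$ as well with $h$. From this and (23), it is obtained that

$$W_{M,h}(\theta)\rightarrow W_{M}(\theta)\quad\text{and}\quad W_{M,h}^{\prime\prime}(\theta)\rightarrow W_{M}^{\prime\prime}(\theta)\quad\text{as }h\rightarrow0. \tag{25}$$

In addition, the left limit in (25) and Lemma 5 imply that

$$\Delta\theta_{M}(h)=\alpha_{M}(h)-\theta_{0}=\arg\{\min_{\theta}W_{M,h}(\theta)\}-\arg\{\min_{\theta}W_{M}(\theta)\}\rightarrow0\quad\text{as }h\rightarrow0, \tag{26}$$

whereas from the right limit in (25) it follows that

$$\Delta P_{M,h}(\theta)\rightarrow0\quad\text{as }h\rightarrow0. \tag{27}$$

Finally, (21), (22), (24), (26) and (27) imply that (18) and (20) hold, which completes the proof. ■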

Theorem 6 states that, for an increasing number of observations, the order-$\beta$ innovation estimator $\widehat{\theta}_{M}(h)$ is asymptotically normally distributed and its bias decreases when $h$ goes to zero. This is an expected result given the asymptotic properties of the exact innovation estimator $\widehat{\theta}_{M}$ derived from Theorem 1 in Ljung & Caines (1979) and the convergence of the approximate estimator $\widehat{\theta}_{M}(h)$ to $\widehat{\theta}_{M}$ given by Theorem 4 when $h$ goes to zero. Further note that, when $h=0$, Theorem 6 reduces to Theorem 1 in Ljung & Caines (1979) for the exact innovation estimator $\widehat{\theta}_{M}$. This is another expected result, since the order-$\beta$ innovation estimator $\widehat{\theta}_{M}(h)$ reduces to the exact one $\widehat{\theta}_{M}$ when $h=0$.

3.3 Models with nonlinear observation equation 

Previous definitions and results have been stated for models with linear observation equation. However, 
by following the procedure proposed in Jimenez and Ozaki (2006), they can be easily applied as well to 
state space models with nonlinear observation equation. 

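To illustrate this, let us consider the state space model defined by the continuous state equation (1) and the discrete observation equation

$$z(t_{k})=h(t_{k},x(t_{k}))+e_{t_{k}},\quad\text{for }k=0,1,\ldots,M-1, \tag{28}$$

where $e_{t_{k}}$ is defined as in (2) and $h:\mathbb{R}\times\mathbb{R}^{d}\rightarrow\mathbb{R}^{r}$ is a twice differentiable function. By using the Ito formula,

$$dh^{j}=\Big\{\frac{\partial h^{j}}{\partial t}+\sum_{k=1}^{d}f^{k}\frac{\partial h^{j}}{\partial x^{k}}+\frac{1}{2}\sum_{s=1}^{m}\sum_{k,l=1}^{d}g_{s}^{l}g_{s}^{k}\frac{\partial^{2}h^{j}}{\partial x^{l}\partial x^{k}}\Big\}dt+\sum_{s=1}^{m}\sum_{l=1}^{d}g_{s}^{l}\frac{\partial h^{j}}{\partial x^{l}}dw^{s}=\rho^{j}dt+\sum_{s=1}^{m}\sigma_{s}^{j}dw^{s}$$

with $j=1,\ldots,r$. Hence, the state space model (1)+(28) is transformed to the following higher-dimensional state space model with linear observation

$$dv(t)=a(t,v(t))dt+\sum_{i=1}^{m}b_{i}(t,v(t))dw^{i}(t),$$

$$z(t_{k})=Cv(t_{k})+e_{t_{k}},\quad\text{for }k=0,1,\ldots,M-1,$$

where

$$v=\begin{bmatrix}x\\h\end{bmatrix},\quad a=\begin{bmatrix}f\\\rho\end{bmatrix},\quad b_{i}=\begin{bmatrix}g_{i}\\\sigma_{i}\end{bmatrix},$$

and the matrix $C$ is such that $h(t_{k},x(t_{k}))=Cv(t_{k})$.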

In this way, the state space model (1)+(28) is transformed to the form of the model (1)-(2), and so the previous definitions and results related to the order-$\beta$ innovation estimator can be applied. Further, note that if the nonlinear function $h$ depends on unknown parameters, they can be estimated as well by the approximate innovation method.
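The transformation to a model with linear observation can be sketched for a scalar state with one noise source ($d=m=r=1$). In the following Python illustration, the helper name and its interface are assumptions: the drift `f`, the diffusion `g` and the partial derivatives of `h` are supplied by the user, and the function returns the drift and diffusion coefficient of the augmented state $v=(x,h(t,x))$ obtained from the Ito formula.

```python
def augment_with_observation(f, g, h_t, h_x, h_xx):
    """Build drift a(t, v) and diffusion b(t, v) of v = (x, h(t, x)).

    f, g          : drift and diffusion of the scalar SDE dx = f dt + g dw
    h_t, h_x, h_xx: partial derivatives of the observation function h(t, x)
    All arguments are callables of (t, x). With C = (0, 1) the observation
    equation becomes linear in v.
    """
    def a(t, v):
        x = v[0]
        # Ito drift of h(t, x(t)): h_t + f*h_x + 0.5*g^2*h_xx
        rho = h_t(t, x) + f(t, x) * h_x(t, x) + 0.5 * g(t, x) ** 2 * h_xx(t, x)
        return (f(t, x), rho)

    def b(t, v):
        x = v[0]
        # Ito diffusion of h(t, x(t)): g*h_x
        return (g(t, x), g(t, x) * h_x(t, x))

    return a, b

# Example (assumed for illustration): dx = -x dt + 0.5 dw observed through h(x) = x^2
drift, diffusion = augment_with_observation(
    f=lambda t, x: -x,
    g=lambda t, x: 0.5,
    h_t=lambda t, x: 0.0,
    h_x=lambda t, x: 2.0 * x,
    h_xx=lambda t, x: 2.0,
)
```

Note that the augmented coefficients depend on `v` only through its first component, so the second component is carried along purely to make the observation equation linear.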



3.4 Models with noise free complete observations 

This section deals with the particular case in which the observation noise is zero and all components of the diffusion process defined in (1) are discretely observed, that is, when $C=I$ and $\Pi_{t_{k}}=0$ in (2) for all $k$, where $I$ denotes the $d$-dimensional identity matrix. Hence, the inference problem under consideration in this paper reduces to the well known problem of parameter estimation of diffusion processes from complete observations. In this situation, it is easy to see that the innovation estimator (3) reduces to the well known quasi-maximum likelihood (QML) estimator for SDEs, and that the approximate order-$\beta$ innovation estimator (11) reduces to the approximate order-$\beta$ QML estimator introduced in Jimenez (2012c) for the estimation of SDEs from complete observations. For the same reason, Theorems 4 and 6 reduce to Theorems 2 and 4 in Jimenez (2012c) concerning the convergence and asymptotic properties of the approximate order-$\beta$ QML estimator.



4 Order-$\beta$ innovation estimator based on LL filters

Since, in principle, any approximate filter converging to the LMV filter of the model (1)-(2) can be used to construct an order-$\beta$ innovation estimator, some additional criteria could be considered for selecting one of them, for instance, high order of convergence, computational efficiency of the algorithm, and so on. In this paper, we chose the order-$\beta$ Local Linearization (LL) filters proposed in Jimenez (2012b) for the following reasons: 1) their predictions have simple explicit formulas that can be computed by means of efficient algorithms (including for high dimensional equations); 2) their predictions are exact for linear SDEs in all the possible variants (with additive and/or multiplicative noise, autonomous or not); 3) they have an adequate order $\beta=1,2$ of convergence; and 4) the better performance of the approximate innovation estimators based on conventional LL filters (see, e.g., Ozaki, 1994; Shoji, 1998; Singer, 2002).

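According to Jimenez (2012b), the order-$\beta$ LL filter is defined on $(\tau)_{h}\supset\{t\}_{M}$ in terms of the order-$\beta$ Local Linear approximation $y$ that satisfies the conditions (4)-(6). Denote by $y_{\tau_{n}/t_{k}}=E(y(\tau_{n})|Z_{t_{k}})$ and $P_{\tau_{n}/t_{k}}=E(y(\tau_{n})y^{\top}(\tau_{n})|Z_{t_{k}})$ the first two conditional moments of $y$ at $\tau_{n}$ given the observations $Z_{t_{k}}$, for all $\tau_{n}\in\{(\tau)_{h}\cap[t_{k},t_{k+1}]\}$ and $k=0,\ldots,M-2$.

Starting with the initial filter values $y_{t_{0}/t_{0}}=x_{t_{0}/t_{0}}$ and $P_{t_{0}/t_{0}}=Q_{t_{0}/t_{0}}$, the LL filter algorithm performs the recursive computation of:

1. the predictions $y_{\tau_{n}/t_{k}}$ and $P_{\tau_{n}/t_{k}}$ for all $\tau_{n}\in\{(\tau)_{h}\cap(t_{k},t_{k+1}]\}$ by means of the recursive formulas (40)-(41) given in the Appendix, and the prediction variance

$$V_{t_{k+1}/t_{k}}=P_{t_{k+1}/t_{k}}-y_{t_{k+1}/t_{k}}y_{t_{k+1}/t_{k}}^{\top};$$

2. the filters

$$y_{t_{k+1}/t_{k+1}}=y_{t_{k+1}/t_{k}}+K_{t_{k+1}}(z_{t_{k+1}}-Cy_{t_{k+1}/t_{k}}),$$

$$V_{t_{k+1}/t_{k+1}}=V_{t_{k+1}/t_{k}}-K_{t_{k+1}}CV_{t_{k+1}/t_{k}},$$

$$P_{t_{k+1}/t_{k+1}}=V_{t_{k+1}/t_{k+1}}+y_{t_{k+1}/t_{k+1}}y_{t_{k+1}/t_{k+1}}^{\top},$$

with filter gain

$$K_{t_{k+1}}=V_{t_{k+1}/t_{k}}C^{\top}(CV_{t_{k+1}/t_{k}}C^{\top}+\Pi_{t_{k+1}})^{-1};$$

for each $k$, with $k=0,1,\ldots,M-2$.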

Under general conditions, the convergence of the order-$\beta$ LL filter to the exact LMV filter when $h$ goes to zero has been stated by Theorem 10 in Jimenez (2012b). Hence, Theorem 4 implies that the LL-based innovation estimator

$$\widehat{\theta}_{M}(h)=\arg\{\min_{\theta}U_{M,h}(\theta,Z)\}, \tag{29}$$

with

$$U_{M,h}(\theta,Z)=(M-1)\ln(2\pi)+\sum_{k=1}^{M-1}\ln(\det(\Sigma_{t_{k}/t_{k-1}}))+(z_{t_{k}}-Cy_{t_{k}/t_{k-1}})^{\top}(\Sigma_{t_{k}/t_{k-1}})^{-1}(z_{t_{k}}-Cy_{t_{k}/t_{k-1}}),$$

converges to the exact one (3) as $h$ goes to zero for all given $Z$, where $\Sigma_{t_{k}/t_{k-1}}=CV_{t_{k}/t_{k-1}}C^{\top}+\Pi_{t_{k}}$. For the same reason, this order-$\beta$ innovation estimator has the asymptotic properties stated in Theorem 6.

Note that, when $(\tau)_{h}\equiv\{t\}_{M}$, the order-$\beta$ LL filter reduces to the conventional LL filter. In this situation, the order-$\beta$ innovation estimator (29) reduces to the conventional innovation estimators of Ozaki (1994) or Shoji (1998) for SDEs with additive noise, and to that of Jimenez & Ozaki (2006) for SDEs with multiplicative noise. It is worth emphasizing here that, for each data point $z_{t_{k}}$, the formulas (40)-(41) for the predictions are recursively evaluated at all the time instants $\tau_{n}\in\{(\tau)_{h}\cap(t_{k},t_{k+1}]\}$ for the order-$\beta$ estimator, whereas they are evaluated only at $t_{k+1}=\{(\tau)_{h}\cap(t_{k},t_{k+1}]\}$ for the conventional ones. In addition, since the predictions of the order-$\beta$ LL filter are exact for linear SDEs, the order-$\beta$ innovation estimator (29) reduces to the maximum likelihood estimator of Schweppe (1965) for linear equations with additive noise.

In practical situations, it is convenient to write a code that automatically determines the time discretization $(\tau)_{h}$ for achieving prescribed absolute ($atol_{y}$, $atol_{P}$) and relative ($rtol_{y}$, $rtol_{P}$) error tolerances in the computation of $y_{t_{k+1}/t_{k}}$ and $P_{t_{k+1}/t_{k}}$. For this purpose, the adaptive strategy proposed in Jimenez (2012b) is useful.

5 Simulation study 

In this section, the performance of the new approximate estimators is illustrated, by means of simulations, with four test SDEs. To do so, four types of innovation estimators are computed and compared: 1) the exact one (3), when it is possible; 2) the conventional one based on the LL filter, that is, the estimator defined by (29) with $(\tau)_{h}\equiv\{t\}_{M}$ and $\beta=1$; 3) the order-1 innovation estimator (29) with various uniform time discretizations $(\tau)_{h}$; and 4) the adaptive order-1 innovation estimator (29) with the adaptive selection of time discretizations $(\tau)$ proposed in Jimenez (2012b). For each example, histograms and confidence limits for the estimators are computed from various sets of discrete and noisy observations taken with different time distances (sampling periods) on time intervals with distinct lengths.

5.1 Test models 

Example 1. State equation with multiplicative noise

$$dx=\alpha txdt+\sigma\sqrt{t}xdw_{1} \tag{30}$$

and observation equation

$$z_{t_{k}}=x(t_{k})+e_{t_{k}},\quad\text{for }k=0,1,\ldots,M-1, \tag{31}$$

with $\alpha=-0.1$, $\sigma=0.1$ and observation noise variance $\Pi=0.0001$. For this state equation, the predictions for the first two conditional moments are

$$x_{t_{k+1}/t_{k}}=x_{t_{k}/t_{k}}e^{\alpha(t_{k+1}^{2}-t_{k}^{2})/2}\quad\text{and}\quad Q_{t_{k+1}/t_{k}}=Q_{t_{k}/t_{k}}e^{(\alpha+\sigma^{2}/2)(t_{k+1}^{2}-t_{k}^{2})},$$

where the filters $x_{t_{k}/t_{k}}$ and $Q_{t_{k}/t_{k}}$ are obtained from the well-known formulas of the exact LMV filter for all $k=0,1,\ldots,M-2$, with initial values $x_{t_{0}/t_{0}}=1$ and $Q_{t_{0}/t_{0}}=1$ at $t_{0}=0.5$.
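For Example 1, the objective of the exact innovation estimator can be sketched in a few lines of Python. The sketch assumes the closed-form moment predictions obtained by solving the moment equations of (30), namely $dm/dt=\alpha tm$ for the mean and $dQ/dt=(2\alpha+\sigma^{2})tQ$ for the second moment, together with the standard LMV update with $C=1$; the function names are hypothetical.

```python
import math

def predict_example1(x_f, Q_f, t0, t1, alpha, sigma):
    """Exact prediction of the mean and second moment of (30) from t0 to t1."""
    d = t1 * t1 - t0 * t0
    x_p = x_f * math.exp(0.5 * alpha * d)
    Q_p = Q_f * math.exp((alpha + 0.5 * sigma ** 2) * d)
    return x_p, Q_p

def exact_objective_example1(z, t, alpha, sigma, Pi, x0=1.0, Q0=1.0):
    """U_M(theta, Z) for Example 1 with theta = (alpha, sigma).

    z, t   : observations and observation times (same length M)
    Pi     : observation noise variance
    x0, Q0 : initial filter values at t[0]
    """
    M = len(z)
    x_f, Q_f = x0, Q0
    U = (M - 1) * math.log(2.0 * math.pi)
    for k in range(1, M):
        x_p, Q_p = predict_example1(x_f, Q_f, t[k - 1], t[k], alpha, sigma)
        V_p = Q_p - x_p * x_p             # prediction variance
        S = V_p + Pi                      # innovation variance (C = 1)
        nu = z[k] - x_p                   # innovation
        U += math.log(S) + nu * nu / S
        K = V_p / S                       # LMV gain
        x_f = x_p + K * nu
        V_f = V_p - K * V_p
        Q_f = V_f + x_f * x_f             # filtered second moment
    return U
```

Minimizing `exact_objective_example1` over `(alpha, sigma)` with a general-purpose optimizer would then give a value of the exact estimator (3) for a given time series.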
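Example 2. State equation with two additive noises

$$dx=\alpha txdt+\sigma t^{2}e^{\alpha t^{2}/2}dw_{1}+\rho\sqrt{t}dw_{2} \tag{32}$$

and observation equation

$$z_{t_{k}}=x(t_{k})+e_{t_{k}},\quad\text{for }k=0,1,\ldots,M-1, \tag{33}$$

with $\alpha=-0.25$, $\sigma=5$, $\rho=0.1$ and observation noise variance $\Pi=0.0001$. For this state equation, the predictions for the first two conditional moments are

$$x_{t_{k+1}/t_{k}}=x_{t_{k}/t_{k}}e^{\alpha(t_{k+1}^{2}-t_{k}^{2})/2}$$

and

$$Q_{t_{k+1}/t_{k}}=\Big(Q_{t_{k}/t_{k}}+\frac{\rho^{2}}{2\alpha}\Big)e^{\alpha(t_{k+1}^{2}-t_{k}^{2})}+\frac{\sigma^{2}}{5}(t_{k+1}^{5}-t_{k}^{5})e^{\alpha t_{k+1}^{2}}-\frac{\rho^{2}}{2\alpha},$$

where the filters $x_{t_{k}/t_{k}}$ and $Q_{t_{k}/t_{k}}$ are obtained from the formulas of the exact LMV filter for all $k=0,1,\ldots,M-2$, with initial values $x_{t_{0}/t_{0}}=10$ and $Q_{t_{0}/t_{0}}=100$ at $t_{0}=0.01$.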
Example 3. Van der Pol oscillator with random input (Gitterman, 2005)

$$dx_{1}=x_{2}dt \tag{34}$$

$$dx_{2}=(-(x_{1}^{2}-1)x_{2}-x_{1}+\alpha)dt+\sigma dw \tag{35}$$

and observation equation

$$z_{t_{k}}=x_{1}(t_{k})+e_{t_{k}},\quad\text{for }k=0,1,\ldots,M-1, \tag{36}$$

where $\alpha=0.5$ and $\sigma^{2}=(0.75)^{2}$ are the intensity and the variance of the random input, respectively. In addition, $\Pi=0.001$ is the observation noise variance, and $x_{t_{0}/t_{0}}^{\top}=[1\ 1]$ and $Q_{t_{0}/t_{0}}=x_{t_{0}/t_{0}}x_{t_{0}/t_{0}}^{\top}$ are the initial filter values at $t_{0}=0$.

Example 4. Van der Pol oscillator with random frequency (Gitterman, 2005)

$$dx_{1}=x_{2}dt \tag{37}$$

$$dx_{2}=(-(x_{1}^{2}-1)x_{2}-\alpha x_{1})dt+\sigma x_{1}dw \tag{38}$$

and observation equation

$$z_{t_{k}}=x_{1}(t_{k})+e_{t_{k}},\quad\text{for }k=0,1,\ldots,M-1, \tag{39}$$

where $\alpha=1$ and $\sigma^{2}=1$ are the frequency mean value and variance, respectively. $\Pi=0.001$ is the observation noise variance, and $x_{t_{0}/t_{0}}^{\top}=[1\ 1]$ and $Q_{t_{0}/t_{0}}=x_{t_{0}/t_{0}}x_{t_{0}/t_{0}}^{\top}$ are the initial filter values at $t_{0}=0$.

In these examples, autonomous or non-autonomous, linear or nonlinear, one- or two-dimensional SDEs with additive or multiplicative noise are considered for the estimation of two or three parameters. Note that, since the first two conditional moments of the SDEs in Examples 1 and 2 have explicit expressions, the exact innovation estimator (3) can be computed.

These four state space models have previously been used in Jimenez (2012b) to illustrate the convergence of the order-$\beta$ LL filter by means of simulations. Tables with the errors between the approximate moments and the exact ones as a function of $h$ were given for Examples 1 and 2. Tables with the estimated rate of convergence were provided for the four examples.



5.2 Simulations with one-dimensional state equations 

For the first two examples, 100 realizations of the state equation solution were computed by means of the
Euler scheme (Kloeden & Platen, 1999) or the Local Linearization scheme (Jimenez et al., 1999) for the
equation with multiplicative or additive noise, respectively. For each example, the realizations were computed
over the thin time partition $\{t_0 + 10^{-4} n : n = 0, \ldots, 30 \times 10^{4}\}$ to guarantee a precise simulation of the
stochastic solutions on the time interval $[t_0, t_0 + 30]$. Twelve subsamples of each realization at the time
instants $\{t\}_{M,T} = \{t_k = t_0 + kT/M : k = 0, \ldots, M-1\}$ were taken for evaluating the corresponding
observation equation with various values of $M$ and $T$. In particular, the values $T = 10, 20, 30$ and $M = T/\delta$
with $\delta = 1, 0.1, 0.01, 0.001$ were used. In this way, twelve sets of 100 time series
$Z^{i}_{\delta,T} = \{z^{i}_{t_k} : t_k \in \{t\}_{M,T},\ M = T/\delta\}$, with $i = 1, \ldots, 100$, of $M$ observations $z^{i}_{t_k}$ each
were finally available for both state space models to make inference. This allows us to explore and compare
the performance of each estimator from observations taken with different sampling periods $\delta$ on time
intervals of distinct lengths $T$.
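
The subsampling step described above can be sketched as follows. This is a minimal illustration, not the code used for the reported simulations: it assumes that a realization is stored as a list of state values on the fine grid with step $10^{-4}$, and the function names `subsample_indices` and `observe` are hypothetical.

```python
import random

FINE_STEP = 1e-4  # step of the fine simulation grid described in the text


def subsample_indices(T, delta, fine_step=FINE_STEP):
    """Fine-grid indices of t_k = t_0 + k*delta for k = 0, ..., M-1, with M = T/delta."""
    M = round(T / delta)
    stride = round(delta / fine_step)
    return [k * stride for k in range(M)]


def observe(path, T, delta, noise_var, rng):
    """Evaluate z_{t_k} = x(t_k) + e_{t_k}, with e_{t_k} ~ N(0, noise_var)."""
    sd = noise_var ** 0.5
    return [path[i] + rng.gauss(0.0, sd) for i in subsample_indices(T, delta)]


def all_designs():
    """The twelve (T, delta) combinations used for the one-dimensional examples."""
    return [(T, d) for T in (10, 20, 30) for d in (1, 0.1, 0.01, 0.001)]
```

For instance, `observe(path, 30, 0.1, 0.0001, random.Random(1))` returns $M = 300$ noisy observations of a fine-grid realization `path` covering $[t_0, t_0 + 30]$.
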



δ = 1          h = δ               h = δ/2             h = δ/8             h = δ/32
α   T = 10     7.5 ± 5.5 × 10^-?   1.8 ± 1.2 × 10^-?   2.9 ± 2.3 × 10^-?   6.8 ± 5.6 × 10^-?
    T = 20     7.7 ± 8.0 × 10^-?   1.7 ± 1.2 × 10^-?   2.7 ± 2.2 × 10^-?   6.4 ± 5.3 × 10^-?
    T = 30     7.1 ± 5.2 × 10^-?   1.7 ± 1.2 × 10^-?   2.7 ± 2.2 × 10^-?   6.3 ± 5.3 × 10^-?
σ   T = 10     3.2 ± 1.9 × 10^-?   1.0 ± 0.6 × 10^-?   2.1 ± 1.1 × 10^-?   5.1 ± 2.6 × 10^-?
    T = 20     3.2 ± 1.9 × 10^-?   1.0 ± 0.6 × 10^-?   2.1 ± 1.1 × 10^-?   5.1 ± 2.6 × 10^-?
    T = 30     3.2 ± 1.9 × 10^-?   1.0 ± 0.6 × 10^-?   2.1 ± 1.1 × 10^-?   5.1 ± 2.6 × 10^-?

Table I: Confidence limits for the error between the exact and the approximate innovation estimators of the
equation (30). $h = \delta$, for the conventional; and $h = \delta/2, \delta/8, \delta/32$, for the order-1 on $(\tau)^{u}_{h,T}$.



δ = 1                      α                                  σ
h          T = 10     T = 20     T = 30        T = 10    T = 20    T = 30
δ          -0.00403   -0.00433   -0.00373      -0.0321   -0.0321   -0.0321
δ/2        -0.00083   -0.00077   -0.00077      -0.0107   -0.0106   -0.0106
δ/8        -0.00004   -0.00002   -0.00002      -0.0021   -0.0021   -0.0021
δ/32        0.00001                            -0.0005   -0.0005   -0.0005
·          -0.00010   -0.00014   -0.00010      -0.0003   -0.0002   -0.0003

Table II: Difference between the averages of the exact and the approximate innovation estimators for the
equation (30). $h = \delta$, for the conventional; $h = \delta/2, \delta/8, \delta/32$, for the order-1 on $(\tau)^{u}_{h,T}$; and
$h = \cdot$, for the adaptive order-1 on $(\tau)_{\cdot,T}$.



Figure 1 shows the histograms and the confidence limits for both the exact ($\hat\alpha^{E}_{\delta,T}$) and the conventional
($\hat\alpha_{\delta,T}$) innovation estimators of $\alpha$ computed from the twelve sets of 100 time series $Z^{i}_{\delta,T}$ available for
Example 1. Figure 2 shows the same for the exact ($\hat\sigma^{E}_{\delta,T}$) and the conventional ($\hat\sigma_{\delta,T}$) innovation
estimators of $\sigma$. As expected, for the samples $Z^{i}_{\delta,T}$ with the largest sampling periods, the parameter
estimation is distorted by the well-known lowpass filter effect of signal sampling (see, e.g., Oppenheim
& Schafer, 2010). This is the reason for the underestimation of the variance $\hat\sigma^{E}_{\delta,T}$ from the samples $Z^{i}_{\delta,T}$,
with $\delta = 1$ and $T = 10, 20, 30$, when the parameter $\alpha$ in the drift coefficient of (30) is better estimated by
$\hat\alpha^{E}_{\delta,T}$. In contrast, from these samples, the conventional innovation estimators $\hat\alpha_{\delta,T}$ cannot provide a good
approximation to $\alpha$, and so the whole unexplained component of the drift coefficient of (30) included in
the samples is interpreted as noise by the conventional estimators. For this reason, $\hat\sigma_{\delta,T}$ overestimates the
value of the parameter $\sigma$. Further, note that when the sampling period $\delta$ decreases, the difference between
the exact ($\hat\alpha^{E}_{\delta,T}, \hat\sigma^{E}_{\delta,T}$) and the conventional ($\hat\alpha_{\delta,T}, \hat\sigma_{\delta,T}$) innovation estimators decreases, as well as the
bias of both estimators. This is another expected result. Here, the bias is estimated by the difference
between the parameter value and the estimator average, whereas the difference between estimators refers
to the histogram shape and confidence limits.

For the data of (30) with the largest sampling period $\delta = 1$, the order-1 innovation estimators
$(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ and $(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T})$ on uniform
$(\tau)^{u}_{h,T} = \{\tau_n = t_0 + nh : n = 0, \ldots, T/h\} \supset \{t\}_{T/\delta,T}$ and adaptive
$(\tau)_{\cdot,T} \supset \{t\}_{T/\delta,T}$ time discretizations, respectively, were computed with $h = \delta/2, \delta/8, \delta/32$ and
tolerances $rtol_{y} = rtol_{P} = 5 \times 10^{-?}$ and $atol_{y} = 5 \times 10^{-?}$, $atol_{P} = 5 \times 10^{-?}$. For each data $Z^{i}_{\delta,T}$, with
$i = 1, \ldots, 100$, the errors

$$\varepsilon_i(\alpha, h, \delta, T) = \left|\hat\alpha^{E}_{\delta,T} - \hat\alpha^{u}_{h,\delta,T}\right|
\quad \text{and} \quad
\varepsilon_i(\sigma, h, \delta, T) = \left|\hat\sigma^{E}_{\delta,T} - \hat\sigma^{u}_{h,\delta,T}\right|$$

between the exact $(\hat\alpha^{E}_{\delta,T}, \hat\sigma^{E}_{\delta,T})$ and the approximate $(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ innovation estimators were
computed. The average and standard deviation of these 100 errors were calculated for each set of values
$h, \delta, T$ specified above, and are summarized in Table I. Note that, for fixed $T$, the average of the errors
decreases as $h$ decreases. This clearly illustrates the convergence of the order-1 innovation estimators to
the exact one, stated in the corresponding theorem, when $h$ goes to zero. In addition, Figure 3 shows the
histograms and the confidence limits for the order-1 innovation estimators $(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ and
$(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T})$ for each set of values $h, \delta, T$. By comparing the results of this figure with the
corresponding ones in the previous figures, a decreasing difference between the order-1 innovation estimators
$(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ and the exact one $(\hat\alpha^{E}_{\delta,T}, \hat\sigma^{E}_{\delta,T})$ is observed as $h$ decreases, which is consistent with
the convergence results of Table I. These findings are summarized in Table II, which shows the difference
between the averages of the exact and the approximate innovation estimators. Further, note the small
difference between the adaptive estimators $(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T})$ and the exact ones $(\hat\alpha^{E}_{\delta,T}, \hat\sigma^{E}_{\delta,T})$, which
illustrates the usefulness of the adaptive strategy for improving the innovation parameter estimation for
finite samples with large sampling periods. The numbers of accepted and failed steps of the adaptive
innovation estimators at each $t_k \in \{t\}_{T/\delta,T}$ are shown in Figure 4.
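
The summary statistics discussed above can be sketched as follows. This is an illustrative sketch under assumptions: the errors are taken as absolute differences, the difference of averages is taken as exact minus approximate, and the bias as parameter value minus estimator average (one literal reading of the definition given earlier). The function names are hypothetical.

```python
from statistics import mean, stdev


def error_summary(exact, approx):
    """Average and sample standard deviation of the errors |exact_i - approx_i|."""
    errors = [abs(e - a) for e, a in zip(exact, approx)]
    return mean(errors), stdev(errors)


def difference_of_averages(exact, approx):
    """Average of the exact estimates minus average of the approximate ones (assumed sign)."""
    return mean(exact) - mean(approx)


def bias(true_value, estimates):
    """Parameter value minus the estimator average (assumed sign convention)."""
    return true_value - mean(estimates)
```

With 100 estimates per configuration, `error_summary` produces one "average ± standard deviation" entry of an error table, while `difference_of_averages` and `bias` each produce one signed entry.
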



δ = 0.1                    α
h          T = 10          T = 20          T = 30
δ           0.00039         0.00031         0.00029
δ/2         0.00010         0.00007         0.00007
δ/4         0.00003         0.00002         0.00001
δ/8         0.00001
·          -0.00005         0.00002         0.00009

δ = 0.1                    σ
h          T = 10          T = 20          T = 30
δ          -0.0311         -0.0291         -0.0287
δ/2        -0.0067         -0.0059         -0.0060
δ/4        -0.0013         -0.0012         -0.0012
δ/8        -0.0002         -0.0001         -0.0001
·          -0.0023         -0.0106

δ = 0.1                    ρ
h          T = 10          T = 20          T = 30
δ          -2.13 × 10^-?   2.4 × 10^-?     3.3 × 10^-?
δ/2        -0.54 × 10^-?   1.4 × 10^-?     1.8 × 10^-?
δ/4        -0.05 × 10^-?   1.0 × 10^-?     1.1 × 10^-?
δ/8         0.03 × 10^-?   0.6 × 10^-?     0.6 × 10^-?
·           2.14 × 10^-?   2.8 × 10^-?     9.4 × 10^-?

Table III: Difference between the averages of the exact and the approximate innovation estimators for the
equation (32). $h = \delta$, for the conventional; $h = \delta/2, \delta/4, \delta/8$, for the order-1 on $(\tau)^{u}_{h,T}$; and
$h = \cdot$, for the adaptive order-1 on $(\tau)_{\cdot,T}$.



δ = 0.1        h = δ               h = δ/2             h = δ/4             h = δ/8
α   T = 10     5.2 ± 4.0 × 10^-?   1.1 ± 0.9 × 10^-?   2.7 ± 2.0 × 10^-?   7.4 ± 5.6 × 10^-?
    T = 20     5.5 ± 4.0 × 10^-4   1.2 ± 0.8 × 10^-?   2.6 ± 1.8 × 10^-?   7.1 ± 5.9 × 10^-?
    T = 30     5.4 ± 3.9 × 10^-?   1.1 ± 0.8 × 10^-?   2.6 ± 1.8 × 10^-?   6.7 ± 5.4 × 10^-?
σ   T = 10     4.8 ± 3.5 × 10^-?   9.7 ± 6.6 × 10^-?   1.8 ± 1.4 × 10^-?   3.6 ± 3.6 × 10^-?
    T = 20     4.9 ± 3.5 × 10^-?   1.0 ± 0.7 × 10^-?   1.9 ± 1.5 × 10^-?   3.9 ± 4.7 × 10^-?
    T = 30     4.9 ± 3.4 × 10^-?   1.0 ± 0.7 × 10^-?   1.9 ± 1.4 × 10^-?   3.7 ± 3.8 × 10^-?
ρ   T = 10     0.8 ± 1.2 × 10^-?   1.9 ± 2.6 × 10^-?   3.9 ± 5.0 × 10^-?   9.9 ± 8.9 × 10^-?
    T = 20     1.3 ± 1.2 × 10^-?   3.7 ± 3.0 × 10^-?   1.3 ± 0.5 × 10^-?   6.6 ± 2.0 × 10^-?
    T = 30     7.5 ± 5.7 × 10^-?   2.5 ± 1.2 × 10^-?   1.1 ± 0.3 × 10^-?   6.0 ± 1.3 × 10^-?

Table IV: Confidence limits for the error between the exact and the approximate innovation estimators of the
equation (32). $h = \delta$, for the conventional; and $h = \delta/2, \delta/4, \delta/8$, for the order-1 on $(\tau)^{u}_{h,T}$.

Figure 5 shows the histograms and the confidence limits for both the exact ($\hat\alpha^{E}_{\delta,T}$) and the conventional
($\hat\alpha_{\delta,T}$) innovation estimators of $\alpha$ computed from the twelve sets of 100 time series $Z^{i}_{\delta,T}$ available for
Example 2. Figure 6 shows the same for the exact ($\hat\sigma^{E}_{\delta,T}$) and the conventional ($\hat\sigma_{\delta,T}$) innovation
estimators of $\sigma$, whereas Figure 7 does so for the estimators $\hat\rho^{E}_{\delta,T}$ and $\hat\rho_{\delta,T}$ of $\rho$. Note that, for this example,
the diffusion parameters $\sigma$ and $\rho$ cannot be estimated from the samples $Z^{i}_{\delta,T}$ with the largest sampling
period $\delta = 1$. From the other data with sampling period $\delta < 1$, the three parameters can be estimated
and the bias of the exact and the conventional innovation estimators is not as large as in the previous
example. Nevertheless, in this extreme situation of low information in the data, the order-1 innovation
estimator is able to improve the accuracy of the parameter estimation when $h$ decreases. This is shown
in Figure 8 for the samples $Z^{i}_{\delta,T}$ with $\delta = 0.1$ and $T = 10, 20, 30$, and summarized in Table III. The
order-1 innovation estimators $(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T}, \hat\rho^{u}_{h,\delta,T})$ and $(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T}, \hat\rho_{\cdot,\delta,T})$ are again computed
on uniform $(\tau)^{u}_{h,T} \supset \{t\}_{T/\delta,T}$ and adaptive $(\tau)_{\cdot,T} \supset \{t\}_{T/\delta,T}$ time discretizations, respectively, with
$T = 10, 20, 30$, $h = \delta/2, \delta/4, \delta/8$ and tolerances $rtol_{y} = rtol_{P} = 5 \times 10^{-?}$ and $atol_{y} = 5 \times 10^{-?}$,
$atol_{P} = 5 \times 10^{-?}$. The averages of accepted and failed steps of the adaptive innovation estimators at each
$t_k \in \{t\}_{T/\delta,T}$ are shown in Figure 4. Observe in Table III the higher difference between the averages of the
exact and the adaptive estimators for the three parameters when $T = 30$. The reason is that, for $t_k > 200$,
the mean and variance of the diffusion process (32) become almost indistinguishable from zero, in such a way
that the signal-to-noise ratio is very small. It is so small that the adaptive strategy computes inaccurate
estimates of the integration errors for the predictions, and hence less accurate estimators for the parameters
of the SDE (32). For this example, the convergence of the order-1 innovation estimators to the exact one is
shown in Table IV, which gives the confidence limits for the error between these estimators for different
values of $h$.

5.3 Simulations with two-dimensional state equations 

For Examples 3 and 4, 100 realizations of the state equation solution were similarly computed by
means of the Local Linearization and the Euler scheme, respectively. For each example, the realizations
were computed over the thin time partition $\{t_0 + 10^{-4} n : n = 0, \ldots, 30 \times 10^{4}\}$ to guarantee a precise
simulation of the stochastic solutions on the time interval $[t_0, t_0 + 30]$. Two subsamples of each
realization at the time instants $\{t\}_{M,T} = \{t_k = t_0 + kT/M : k = 0, \ldots, M-1\}$ were taken for evaluating
the corresponding observation equation, with $T = 30$ and two values of $M$. In particular, $M = 30, 300$
were used, which correspond to the sampling periods $\delta = 1, 0.1$. In this way, two sets of 100 time series
$Z^{i}_{\delta,T} = \{z^{i}_{t_k} : t_k \in \{t\}_{M,T},\ M = T/\delta\}$, with $i = 1, \ldots, 100$, of $M$ observations $z^{i}_{t_k}$ each were available
for both state space models with the two values of $(\delta, T)$ mentioned above.
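
As an illustration of the simulation step for the equation with multiplicative noise, the following sketch applies the Euler (Euler-Maruyama) scheme to the Van der Pol oscillator with random frequency, $dx_1 = x_2\,dt$, $dx_2 = (-(x_1^{2}-1)x_2 - \alpha x_1)\,dt + \sigma x_1\,dw$. It is a sketch under assumptions, not the code behind the reported results: the function name is hypothetical, and the initial state used in the usage note is an assumption, since only the initial filter values are given in the text.

```python
import math
import random


def euler_vdp_random_frequency(alpha, sigma, x0, t_end, dt, rng):
    """Euler-Maruyama path of the Van der Pol oscillator with random frequency.

    Returns the list of states (x1, x2) on the grid 0, dt, 2*dt, ..., t_end.
    """
    n_steps = round(t_end / dt)
    sqrt_dt = math.sqrt(dt)
    x1, x2 = x0
    path = [(x1, x2)]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sqrt_dt)  # Wiener increment over one step
        drift2 = -(x1 * x1 - 1.0) * x2 - alpha * x1
        # Both components are updated from the values at the start of the step.
        x1, x2 = x1 + x2 * dt, x2 + drift2 * dt + sigma * x1 * dw
        path.append((x1, x2))
    return path
```

For example, `euler_vdp_random_frequency(1.0, 1.0, (1.0, 1.0), 30.0, 1e-4, random.Random(0))` produces one realization on the fine grid described above; the first component, sampled every $\delta/10^{-4}$ steps and perturbed by observation noise, then gives one time series.
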



T = 30               α                        σ
h          δ = 1      δ = 0.1       δ = 1      δ = 0.1
δ          -0.4588    -0.1403       -0.7240    -0.0140
δ/16       -0.1244    -0.0026       -0.2180     0.0103
δ/64       -0.0336     0.0041       -0.1883     0.0104
·          -0.0108     0.0064       -0.1803     0.0099

Table V: Bias of the approximate innovation estimators for the equation (34)-(35). $h = \delta$, for the conventional;
$h = \delta/16, \delta/64$, for the order-1 on $(\tau)^{u}_{h,T}$; and $h = \cdot$, for the adaptive order-1 on $(\tau)_{\cdot,T}$.



T = 30               α                        σ
h          δ = 1      δ = 0.1       δ = 1      δ = 0.1
δ          -0.8511    -0.2740       -1.0347    -0.0239
δ/8        -0.2488    -0.0662       -0.3107     0.0071
δ/32       -0.1887    -0.0472       -0.2857     0.0072
·          -0.1550    -0.0373       -0.2805     0.0084

Table VI: Bias of the approximate innovation estimators for the equation (37)-(38). $h = \delta$, for the conventional;
$h = \delta/8, \delta/32$, for the order-1 on $(\tau)^{u}_{h,T}$; and $h = \cdot$, for the adaptive order-1 on $(\tau)_{\cdot,T}$.

For both examples, the order-1 innovation estimators $(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ and $(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T})$ on uniform
$(\tau)^{u}_{h,T} \supset \{t\}_{T/\delta,T}$ and adaptive $(\tau)_{\cdot,T} \supset \{t\}_{T/\delta,T}$ time discretizations, respectively, were computed from
the two sets of 100 data $Z^{i}_{\delta,T}$ with $T = 30$ and $\delta = 1, 0.1$. The values of $h$ were set as $h = \delta, \delta/16, \delta/64$
for Example 3, and as $h = \delta, \delta/8, \delta/32$ for Example 4. The tolerances for the adaptive estimators
were set as in the first example. Figures 9 and 11 show the histograms and the confidence limits for the
estimators $(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ and $(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T})$ corresponding to each example. For the two examples, the
difference between the order-1 innovation estimator $(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ and the adaptive one
$(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T})$ decreases as $h$ decreases. According to Theorem 4, this is an expected result, assuming that
the difference between the adaptive and the exact innovation estimators is negligible for $(\tau)_{\cdot,T}$ thin enough.
In addition, Tables V and VI show the bias of the approximate innovation estimators for these examples.
Observe that the adaptive $(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T})$ and the order-1 innovation estimators $(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ with
$h < \delta$ provide much less biased estimates of the parameters $(\alpha, \sigma)$ than the conventional innovation
estimator $(\hat\alpha^{u}_{\delta,\delta,T}, \hat\sigma^{u}_{\delta,\delta,T})$, which is in fact unable to identify the parameters of the examples. Clearly, this
illustrates the usefulness of the order-1 innovation estimator and its adaptive implementation. However, as
shown in Table V for $\delta = 0.1$, the adaptive estimator $(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T})$ is not always less biased than the
order-1 innovation estimator $(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ for some $h < \delta$. This can happen for one of the following
reasons: 1) the bias of the exact innovation estimator when the adaptive estimator is close enough to it, or
2) an insufficient number of accepted steps of the adaptive estimator for a given tolerance. In our case,
since $(\hat\alpha^{u}_{h,\delta,T}, \hat\sigma^{u}_{h,\delta,T})$ converges to $(\hat\alpha_{\cdot,\delta,T}, \hat\sigma_{\cdot,\delta,T})$ as $h$ decreases (Figure 9 with $\delta = 0.1$) and the average
of accepted steps of the adaptive estimators is acceptable (Figure 10 with $\delta = 0.1$), the first explanation is
more plausible. Figures 10 and 12 show the averages of accepted and failed steps of the adaptive estimators
at each $t_k \in \{t\}_{T/\delta,T}$ for each example. Note that the average of accepted steps corresponding to the
estimators from samples with $\delta = 0.1$ is ten times lower than that of the estimators from samples with
$\delta = 1$, which is an expected result as well.

5.4 Simulations with noise free observation equations 

In Section 3.4, the connection between the innovation and quasi-maximum likelihood estimators was
mentioned for the identification of models with noise-free complete observations. In this situation, it
is easy to verify that the LL-based innovation estimator (29) reduces to the LL-based quasi-maximum
likelihood estimator introduced in Jimenez (2012b). In that paper, the state equations of the four models
considered in Section 5.1 were also used as test examples in simulations. The reader interested in this
identification problem is encouraged to consider these simulations.

6 Conclusions 

An alternative approximation to the innovation method was introduced for the parameter estimation
of diffusion processes given a time series of partial and noisy observations. This is based on a convergent
approximation to the first two conditional moments of the innovation process through approximate
continuous-discrete filters of minimum variance. For all given data, the convergence of the approximate
innovation estimators to the exact one was proved when the error between the approximate and the exact
linear minimum variance filters decreases. It was also demonstrated that, for an increasing number of
observations, the approximate estimators are asymptotically normally distributed and their bias decreases
when the above-mentioned error decreases. As a particular instance, the order-$\beta$ innovation estimators based
on Local Linearization filters were proposed. For them, practical algorithms were also provided and
their performance in simulations was illustrated with various examples. The simulations showed that: 1) with
thin time discretizations between observations, the order-1 innovation estimator provides satisfactory
approximations to the exact innovation estimator; 2) the order-1 innovation estimator converges to
the exact one when the maximum stepsize of the time discretization between observations decreases;
3) with respect to the conventional innovation estimator, the order-1 innovation estimator gives a much
better approximation to the exact innovation estimator, and has less bias and higher efficiency; 4) with
an adequate tolerance, the adaptive order-1 innovation estimator provides an automatic, suitable and
computationally efficient approximation to the exact innovation estimator; and 5) the order-1 innovation
estimator is effective for the identification of SDEs from a reduced number of partial and noisy
observations.
7 Appendix

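According to Jimenez (2012b), given the filter values $y_{t_k/t_k}$ and $P_{t_k/t_k}$, the predictions $y_{t/t_k}$ and $P_{t/t_k}$
of the order-$\beta$ LL filter are computed by the recursive formulas

$$y_{t/t_k} = \cdots \qquad (40)$$

and

$$\mathrm{vec}(P_{t/t_k}) = \cdots \qquad (41)$$

for all $t \in (t_k, t_{k+1}]$ and $t_k, t_{k+1} \in \{t\}_M$, where

$$n_t = \max\{n = 0, 1, \ldots : \tau_n \le t \ \text{and}\ \tau_n \in (\tau)_h\},$$

and the vector $u_{\tau,t_k}$ and the matrices $M(\tau)$, $L_1$, $L_2$ are defined as

$$M(\tau) = \cdots, \qquad u_{\tau,t_k} = \cdots \in \mathbb{R}^{(d^{2}+2d+7)},$$

$$L_1 = [\, I_{d^{2}} \ \ 0_{d^{2} \times (2d+7)} \,] \quad \text{and} \quad L_2 = [\, 0_{d \times (d^{2}+d+2)} \ \ I_d \ \ 0_{d \times 5} \,],$$

in terms of the matrices and vectors

$$\mathcal{A}(\tau) = A(\tau) \oplus A(\tau) + \sum_{i=1}^{m} B_i(\tau) \otimes B_i(\tau),$$

$$C(\tau) = \begin{bmatrix} A(\tau) & a_1(\tau) & A(\tau) y_{\tau/t_k} + a_0(\tau) \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \in \mathbb{R}^{(d+2) \times (d+2)},
\qquad [\, 0_{1 \times (d+1)} \ \ 1 \,],$$

$\mathcal{B}_1(\tau) = \mathrm{vec}(\beta_1(\tau)) + \beta_4(\tau) y_{\tau/t_k}$, $\mathcal{B}_2(\tau) = \mathrm{vec}(\beta_2(\tau)) + \beta_5(\tau) y_{\tau/t_k}$,
$\mathcal{B}_3(\tau) = \mathrm{vec}(\beta_3(\tau))$, $\mathcal{B}_4(\tau) = \beta_4(\tau) L$ and $\mathcal{B}_5(\tau) = \beta_5(\tau) L$, with

$$\beta_1(\tau) = \sum_{i=1}^{m} b_{i,0}(\tau)\, b_{i,0}^{\intercal}(\tau),$$

$$\beta_2(\tau) = \sum_{i=1}^{m} b_{i,0}(\tau)\, b_{i,1}^{\intercal}(\tau) + b_{i,1}(\tau)\, b_{i,0}^{\intercal}(\tau),$$

$$\beta_3(\tau) = \sum_{i=1}^{m} b_{i,1}(\tau)\, b_{i,1}^{\intercal}(\tau),$$

$$\beta_4(\tau) = a_0(\tau) \oplus a_0(\tau) + \sum_{i=1}^{m} b_{i,0}(\tau) \otimes B_i(\tau) + B_i(\tau) \otimes b_{i,0}(\tau),$$

$$\beta_5(\tau) = a_1(\tau) \oplus a_1(\tau) + \sum_{i=1}^{m} b_{i,1}(\tau) \otimes B_i(\tau) + B_i(\tau) \otimes b_{i,1}(\tau),$$

$L = [\, I_d \ \ 0_{d \times 2} \,]$, and the $d$-dimensional identity matrix $I_d$. Here,

$$A(\tau) = \frac{\partial f(\tau, y_{\tau/t_k})}{\partial y} \quad \text{and} \quad B_i(\tau) = \frac{\partial g_i(\tau, y_{\tau/t_k})}{\partial y}$$

are matrices, and the vectors $a_0(\tau_{n_t})$, $a_1(\tau_{n_t})$, $b_{i,0}(\tau_{n_t})$ and $b_{i,1}(\tau_{n_t})$ satisfy the expressions

$$a^{\beta}(t; \tau_{n_t}) = a_0(\tau_{n_t}) + a_1(\tau_{n_t})(t - \tau_{n_t}) \quad \text{and} \quad
b_i^{\beta}(t; \tau_{n_t}) = b_{i,0}(\tau_{n_t}) + b_{i,1}(\tau_{n_t})(t - \tau_{n_t})$$

for all $t \in [t_k, t_{k+1}]$ and $\tau_{n_t} \in (\tau)_h$, where

$$a^{\beta}(t; \tau) = \begin{cases}
f(\tau, y_{\tau/t_k}) - \dfrac{\partial f(\tau, y_{\tau/t_k})}{\partial y}\, y_{\tau/t_k} + \dfrac{\partial f(\tau, y_{\tau/t_k})}{\partial \tau}(t - \tau) & \text{for } \beta = 1 \\[2ex]
a^{1}(t; \tau) + \dfrac{1}{2} \displaystyle\sum_{j,l=1}^{d} \left[G(\tau, y_{\tau/t_k})\, G^{\intercal}(\tau, y_{\tau/t_k})\right]^{j,l} \dfrac{\partial^{2} f(\tau, y_{\tau/t_k})}{\partial y^{j} \partial y^{l}}(t - \tau) & \text{for } \beta = 2
\end{cases}$$

and

$$b_i^{\beta}(t; \tau) = \begin{cases}
g_i(\tau, y_{\tau/t_k}) - \dfrac{\partial g_i(\tau, y_{\tau/t_k})}{\partial y}\, y_{\tau/t_k} + \dfrac{\partial g_i(\tau, y_{\tau/t_k})}{\partial \tau}(t - \tau) & \text{for } \beta = 1 \\[2ex]
b_i^{1}(t; \tau) + \dfrac{1}{2} \displaystyle\sum_{j,l=1}^{d} \left[G(\tau, y_{\tau/t_k})\, G^{\intercal}(\tau, y_{\tau/t_k})\right]^{j,l} \dfrac{\partial^{2} g_i(\tau, y_{\tau/t_k})}{\partial y^{j} \partial y^{l}}(t - \tau) & \text{for } \beta = 2
\end{cases}$$

are the order-$\beta$ Ito-Taylor expansions for the drift and diffusion coefficients of (1) in the neighborhood
of $(\tau, y_{\tau/t_k})$, respectively, and $G = [g_1, \ldots, g_m]$ is a $d \times m$ matrix function. The symbols vec, $\oplus$ and $\otimes$
denote the vectorization operator, the Kronecker sum and product, respectively.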
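
To illustrate how a block matrix of the form of $C(\tau)$ turns an integral into a single matrix exponential, the sketch below treats the scalar case $d = 1$. For the linearized drift $A y + a_0 + a_1 (t - \tau)$, the top-right entry of $e^{C h}$ with $C = [[A, a_1, f], [0, 0, 1], [0, 0, 0]]$ and $f = A y + a_0$ equals $\int_0^{h} e^{A(h-s)} (f + a_1 s)\,ds$, which is the increment of the deterministic Local Linearization step. This is a minimal sketch under assumptions: the function names are hypothetical, and the matrix exponential is computed with a truncated Taylor series plus scaling and squaring as a simple stand-in for the Padé-based algorithms.

```python
import math


def mat_mul(a, b):
    """Product of two dense matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def expm_taylor(a, terms=20):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm))) + 1 if norm > 0 else 0
    scale = 2.0 ** squarings
    scaled = [[v / scale for v in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):  # undo the scaling: exp(A) = exp(A / 2^s)^(2^s)
        result = mat_mul(result, result)
    return result


def ll_mean_step(y, f_val, jac, a1, h):
    """One scalar Local Linearization step: y + [exp(C h)]_{0,2}."""
    c_h = [[jac * h, a1 * h, f_val * h],
           [0.0, 0.0, h],
           [0.0, 0.0, 0.0]]
    return y + expm_taylor(c_h)[0][2]
```

For a drift that is exactly linear in $y$ and $t$, such as $y' = \lambda y + c + \kappa t$, the step `ll_mean_step(y0, lam * y0 + c, lam, kappa, h)` reproduces the exact solution at time $h$ up to rounding, which gives a convenient sanity check of the construction.
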

From a computational viewpoint, each evaluation of the formulas (40)-(41) at $\tau_n$ requires the computation
of just one matrix exponential, whose matrix depends on the drift and diffusion coefficients of (1)
at $(\tau_{n-1}, y_{\tau_{n-1}/t_k})$. This matrix exponential can be efficiently computed through the well-known Padé
method (Moler & Van Loan, 2003) or, alternatively, by means of the Krylov subspace method (Moler &
Van Loan, 2003) in the case of high-dimensional SDEs. Moreover, low-order Padé and Krylov methods,
as suggested in Jimenez & de la Cruz (2012), can be used as well for reducing the computational cost while
preserving the order-$\beta$ of the approximate moments. Alternatively, simplified formulas for the moments
can be used when the equation to be estimated is autonomous or has additive noise (see Jimenez, 2012a).
All this makes the evaluation of the approximate moments $y_{t_{k+1}/t_k}$ and $P_{t_{k+1}/t_k}$ required by the
innovation estimator (29) simple and efficient.

Figure 1: Histograms and confidence limits for the exact ($\hat\alpha^{E}_{\delta,T}$) and the conventional ($\hat\alpha_{\delta,T}$) innovation
estimators of $\alpha$ computed from the Example 1 data with sampling period $\delta$ and time interval of length
$T$.

Figure 2: Histograms and confidence limits for the exact ($\hat\sigma^{E}_{\delta,T}$) and the conventional ($\hat\sigma_{\delta,T}$) innovation
estimators of $\sigma$ computed from the Example 1 data with sampling period $\delta$ and time interval of length
$T$.

Figure 3: Histograms and confidence limits for the order-1 innovation estimators of $\alpha$ and $\sigma$ computed on
uniform $(\tau)^{u}_{h,T}$ and adaptive $(\tau)_{\cdot,T}$ time discretizations from the Example 1 data with sampling period
$\delta = 1$ and time interval of length $T$.

Figure 4: Average (*) and 90% confidence limits (-) of accepted and failed steps of the adaptive innovation
estimator at each $t_k \in \{t\}_N$ in Examples 1 and 2.

Figure 5: Histograms and confidence limits for the exact ($\hat\alpha^{E}_{\delta,T}$) and the conventional ($\hat\alpha_{\delta,T}$) innovation
estimators of $\alpha$ computed from the Example 2 data with sampling period $\delta$ and time interval of length
$T$.

Figure 6: Histograms and confidence limits for the exact ($\hat\sigma^{E}_{\delta,T}$) and the conventional ($\hat\sigma_{\delta,T}$) innovation
estimators of $\sigma$ computed from the Example 2 data with sampling period $\delta$ and time interval of length
$T$.

Figure 7: Histograms and confidence limits for the exact ($\hat\rho^{E}_{\delta,T}$) and the conventional ($\hat\rho_{\delta,T}$) innovation
estimators of $\rho$ computed from the Example 2 data with sampling period $\delta$ and time interval of length $T$.

Figure 8a: Histograms and confidence limits for the order-1 innovation estimators of $\alpha$ and $\sigma$ computed
on uniform $(\tau)^{u}_{h,T}$ and adaptive $(\tau)_{\cdot,T}$ time discretizations from the Example 2 data with sampling period
$\delta = 0.1$ and time interval of length $T$.

Figure 8b: Histograms and confidence limits for the order-1 innovation estimators of $\rho$ computed on
uniform $(\tau)^{u}_{h,T}$ and adaptive $(\tau)_{\cdot,T}$ time discretizations from the Example 2 data with sampling period
$\delta = 0.1$ and time interval of length $T$.

Figure 9: Histograms and confidence limits for the order-1 innovation estimators of $\alpha$ and $\sigma$ computed on
uniform $(\tau)^{u}_{h,T}$ and adaptive $(\tau)_{\cdot,T}$ time discretizations from the Example 3 data with sampling period $\delta$
and time interval of length $T = 30$.

Figure 10: Average (*) and 90% confidence limits (-) of accepted and failed steps of the adaptive innovation
estimator at each $t_k \in \{t\}_N$ in Example 3.

Figure 11: Histograms and confidence limits for the order-1 innovation estimators of $\alpha$ and $\sigma$ computed
on uniform $(\tau)^{u}_{h,T}$ and adaptive $(\tau)_{\cdot,T}$ time discretizations from the Example 4 data with sampling period
$\delta$ and time interval of length $T = 30$.

Figure 12: Average (*) and 90% confidence limits (-) of accepted and failed steps of the adaptive innovation
estimator at each $t_k \in \{t\}_N$ in Example 4.